Zero-Truncated Poisson Tensor Factorization for Massive Binary Tensors

Authors

  • Changwei Hu
  • Piyush Rai
  • Lawrence Carin
Abstract

We present a scalable Bayesian model for low-rank factorization of massive tensors with binary observations. The proposed model has the following key properties: (1) in contrast to models based on the logistic or probit likelihood, using a zero-truncated Poisson likelihood for binary data allows our model to scale with the number of ones in the tensor, which is especially appealing for massive but sparse binary tensors; (2) side information in the form of binary pairwise relationships (e.g., an adjacency network) between objects in any tensor mode can also be leveraged, which can be especially useful in "cold-start" settings; and (3) the model admits simple Bayesian inference via batch as well as online MCMC; the latter allows scaling up even for dense binary data (i.e., when the number of ones in the tensor/network is also massive). In addition, the non-negative factor matrices in our model provide easy interpretability, and the tensor rank can be inferred from the data. We evaluate our model on several large-scale real-world binary tensors, achieving excellent computational scalability, and also demonstrate its usefulness in leveraging side information provided in the form of mode network(s).
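The scaling claim in property (1) can be illustrated with a minimal sketch. It assumes the standard construction behind a truncated-Poisson link for binary data, which the abstract does not spell out: each entry is b = 1 when a latent count n ~ Poisson(λ) is at least 1, so P(b = 1) = 1 - exp(-λ), with λ given by a rank-R CP sum over non-negative factors. It also assumes every entry of the tensor is observed. The function names (`cp_rate`, `prob_one`, `total_rate`, `log_likelihood`) are hypothetical, and the sketch covers only the likelihood, not the priors or MCMC inference described in the abstract.

```python
import math


def cp_rate(factors, index):
    """Poisson rate of one entry: sum over r of the product over modes of U_k[i_k][r]."""
    rank = len(factors[0][0])
    rate = 0.0
    for r in range(rank):
        term = 1.0
        for mode, i in enumerate(index):
            term *= factors[mode][i][r]
        rate += term
    return rate


def prob_one(rate):
    """P(b = 1) = P(n >= 1) = 1 - exp(-rate) for n ~ Poisson(rate)."""
    return -math.expm1(-rate)


def total_rate(factors):
    """Sum of the rates over ALL entries, computed without enumerating them.

    The sum over all indices of a CP model factorizes into
    sum_r prod_k (column sum of U_k for component r).
    """
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for mode_factor in factors:
            term *= sum(row[r] for row in mode_factor)
        total += term
    return total


def log_likelihood(factors, ones):
    """Log-likelihood of a fully observed binary tensor given the indices of its ones.

    Each zero contributes -rate and each one contributes log(1 - exp(-rate)).
    Writing the zero terms as (all entries) minus (ones) leaves a loop over
    the ones only. Assumes every listed one has a strictly positive rate.
    """
    ll = -total_rate(factors)
    for index in ones:
        rate = cp_rate(factors, index)
        ll += math.log(prob_one(rate)) + rate
    return ll
```

Under these assumptions the cost of evaluating the likelihood grows with the number of ones plus the size of the factor matrices, rather than with the total number of tensor entries, which is the behavior a logistic or probit likelihood does not offer directly.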

Similar resources

Scalable Probabilistic Tensor Factorization for Binary and Count Data

Tensor factorization methods provide a useful way to extract latent factors from complex multirelational data, and also for predicting missing data. Developing tensor factorization methods for massive tensors, especially when the data are binary- or count-valued (which is true of most real-world tensors), however, remains a challenge. We develop a scalable probabilistic tensor factorization frame...

Scalable Bayesian Non-negative Tensor Factorization for Massive Count Data

We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, leveraging a reparameterization of the Poisson distribution as a multinomial facilitates conju...

Iterative Splits of Quadratic Bounds for Scalable Binary Tensor Factorization

Binary matrices and tensors are popular data structures that need to be efficiently approximated by low-rank representations. A standard approach is to minimize the logistic loss, well suited for binary data. In many cases, the number m of non-zero elements in the tensor is much smaller than the total number n of possible entries in the tensor. This creates a problem for large tensors because t...

On Tensors, Sparsity, and Nonnegative Factorizations

Tensors have found application in a variety of fields, ranging from chemometrics to signal processing and beyond. In this paper, we consider the problem of multilinear modeling of sparse count data. Our goal is to develop a descriptive tensor factorization model of such data, along with appropriate algorithms and theory. To do so, we propose that the random variation is best described via a Poi...

Nonnegative Tensor Factorization, Completely Positive Tensors, and a Hierarchical Elimination Algorithm

Nonnegative tensor factorization has applications in statistics, computer vision, exploratory multiway data analysis and blind source separation. A symmetric nonnegative tensor, which has an exact symmetric nonnegative factorization, is called a completely positive tensor. This concept extends the concept of completely positive matrices. A classical result in the theory of completely positive m...

Publication date: 2015